Alfredo Barretto, University of Buenos Aires,
acbarrettomdp@gmail.com PRIMARY
Nelson Amaya, University of Buenos Aires, nelmaya@gmail.com
Juan Orlowski, University of Buenos Aires,
orlowski@agro.uba.ar
Student Team: YES
Did you use data from both
mini-challenges? NO
Tableau
Excel
SQL Server
InfoStat
SPSS
Approximately
how many hours were spent working on this submission
in total?
60
May we post
your submission in the Visual Analytics Benchmark Repository after VAST
Challenge 2015 is complete? YES
Video:
Questions
MC1.1 – Characterize the attendance at DinoFun World
on this weekend. Describe up to twelve different types of groups at the park on
this weekend.
a. How big is this type of group?
b. Where does this type of group like to go in the park?
c. How common is this type of group?
d. What are your other observations about this type of group?
e. What can you infer about this type of group?
f. If you were to make one improvement to the park to better meet this
group’s needs, what would it be?
We selected eight big groups in order to describe the most popular patterns
of attraction attendance. Additionally, we selected four small groups in
order to describe uncommon patterns that could be related to the crime.
Among the big groups we have:
Clusters 17, 11 and 2 are composed of 668, 458 and 557 IDs respectively.
They mainly showed a preference for Thrill Rides (TR) and avoided Kiddie
Land (KL) games (Figure 1). These clusters comprised 5.9, 4.0 and 4.9% of
the IDs that attended this weekend. Cluster 11 also showed a main
preference for the Shows and Entertainment (S&E) attractions of the Coaster
Alley (CA) zone and secondary preferences for Rides for Everyone (RfE)
games and food store 31, while cluster 2 also showed a main preference for
S&E attraction 64 and secondary preferences for RfE games. It can be
inferred that most of these IDs are people who like adventure and risk and
who attended without children. The improvements we suggest are 1) to place
a restroom near Game 3 (TR), because the nearest restroom is about 350
meters from this game, and 2) to set up beer gardens between attractions 48
and 31, 81 and 4, and 32 and 57, because the other beer places are far from
these zones.
Clusters 5 and 24 are composed of 465 and 554 IDs respectively. Both
clusters showed a main preference for TR games; cluster 5 also preferred
all S&E games, while cluster 24 only preferred S&E 64 and 32 (Figure 1).
These clusters comprised 4.1 and 4.9% of the IDs that attended this
weekend. Both clusters showed a regular, intermediate preference for the
other games. It can be inferred that some of these IDs attended with
children and tried to visit all the attractions of the park. We suggest
placing an S&E attraction between TR games 5 and 3, as they are far away
from any S&E attraction.
Clusters 10, 20 and 15 are composed of 440, 589 and 459 IDs respectively.
They showed clear preferences for some, but not all, TR games and for other
attractions. Cluster 10 showed preferences for TR games 1, 6 and 4 and for
S&E attraction 64, while cluster 20 mainly liked to go to TR game 7, to the
TR of Wet Land (WL) and to TR game 5 of Tundra Land (TL). Similarly,
cluster 15 preferred TR games 8, 7, 1 and 3, S&E attraction 32 and Kiddie
Ride (KR) game 11. These clusters comprised 2.3, 5.2 and 4.0% of the IDs
that attended this weekend. Other observations about these clusters are
that 1) cluster 10 showed an intermediate preference for games 12, 15, 18,
8, 32, 63, 23, 81, 3, 5 and 26, 2) cluster 20 showed an intermediate
preference for CA games and a minor preference for KL and TL games, and
3) cluster 15 showed a minor preference for games 12, 6, 30, 31 and 62 and
an intermediate preference for the rest of the games. It can be inferred
that the people of these clusters attended with children who, in the case
of cluster 15, mainly preferred Game 11. We suggest placing another
restroom and food store between games 4 and 6, because there is a low
concentration of them in this area.
Among the small groups we have the following (no improvements are given, as
we considered it unjustifiable to make changes to the park only for minor
groups):
Cluster 3 was composed of 33 IDs. It is characterized by showing equal
preference for the games its members attended. They showed a preference for
KR games 11, 14 and 17, for all S&E games, for RfE games 23, 22, 25 and 26,
for TR games 81, 3 and 5, and for food store 31. It comprised 0.3% of the
IDs that attended this weekend. This group showed no preference for
restroom 49 or for information store 62. It can be inferred that it is a
group of people who followed the same tour, as their check-in frequencies
are identical.
Cluster 9 was composed of 80 IDs. They showed a preference for KR games 9,
10, 11, 13, 14 and 15, for TR games 2, 6, 7 and 5, and for the RfE of the
TL zone, mainly games 5, 25 and 26. It comprised 0.7% of the IDs that
attended this weekend. This group showed no preference for KR games 12, 17,
18 and 19, for S&E attractions 64 and 32, for RfE games 24, 30, 22 and 27,
for TR games 4 and 81, for restroom 49 or for information store 62. These
people came to the park with children and, as there are several games with
equal frequency, the cluster is composed of a group of people who followed
the same tour.
Cluster 19 was composed of 56 IDs. They showed a preference for several
games (KR 17, S&E 64, TR 2, 6, 8, 3 and 5, and RfE 23, 26 and 27) and all
zones. It comprised 0.5% of the IDs that attended this weekend. This group
showed no preference for the S&E of the CA zone or for information store
62. As there are several games (7) with equal frequency, it can be inferred
that this cluster also contains a group of people who followed the same
tour.
Cluster 13 was composed of 79 IDs. They showed a preference for KR games
16, 17 and 18, for TR games 8, 81 and 5, and for RfE games 21, 22 and 27.
It comprised 0.7% of the IDs that attended this weekend. This group showed
no preference for KR games 9, 10, 11 and 19, for TR games 1 and 2, for RfE
games 29, 20, 23, 25, 26 and 28, for S&E attractions 32 and 63, for food
store 31, for restroom 49 or for information store 62. These people came to
the park with children and, as there are several games (5) with equal
frequency, the cluster is composed of a group of people who followed the
same tour.
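The "same tour" inference for the small clusters rests on members having identical check-in frequencies. A minimal sketch of that check follows, assuming a hypothetical list of (visitor ID, attraction number) check-in records. It is an illustration only and not the procedure actually used for this submission, which relied on the tools listed at the top.

```python
from collections import Counter, defaultdict

def same_tour_groups(checkins):
    """Group visitor IDs whose per-attraction check-in counts are identical.

    checkins: iterable of (visitor_id, attraction) pairs (hypothetical layout).
    Returns a sorted list of ID lists; only groups of two or more are kept.
    """
    profiles = defaultdict(Counter)
    for visitor_id, attraction in checkins:
        profiles[visitor_id][attraction] += 1

    groups = defaultdict(list)
    for visitor_id, counts in profiles.items():
        # An order-independent, hashable fingerprint of the frequency profile.
        groups[frozenset(counts.items())].append(visitor_id)

    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)
```

Identical frequencies do not prove that the visits happened in the same order or at the same time; confirming a shared tour would also require comparing timestamps.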
Figure 1. Selected clusters indicating the relative
attendance of their members to different games. Clusters 2, 5, 10, 11, 15, 17,
20, 14 are big clusters while clusters 3, 9, 19, 13 are small.
MC1.2 – Are there notable differences in the patterns of activity in the park
across the three days? Please describe
the notable differences you see.
One of the notable differences is that the number of people in the park who
checked in increased from Friday to Sunday. On Friday, 47,033 different IDs
that checked in were detected, while on Saturday and Sunday this number
increased to 73,767 and 84,183, respectively. This increase is seen in all
games except S&E attractions 32 and 63 (Figure 2A).
Additionally, the relative preference of IDs for the games was similar
between Friday and Saturday but differed on Sunday (Figure 2B). On Sunday,
relative preferences for S&E attractions 32 and 63 of the CA zone were
lower than on the other days, as was the case for TR game 4 of WL. In
contrast, relative preferences for TR games 1, 2, 6 and 8 of the CA zone
and TR game 63 of the TL zone were higher than on the other days.
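The distinction between absolute and relative attendance (Figure 2A versus 2B) can be sketched as follows. This is a minimal illustration, assuming hypothetical (day, attraction) check-in records; the record layout and function name are assumptions, not the workflow used for the figures.

```python
from collections import Counter, defaultdict

def attendance_by_day(checkins):
    """Return absolute counts and relative shares of check-ins per day.

    checkins: iterable of (day, attraction) pairs (hypothetical layout).
    absolute[day][attraction] is a raw count; relative[day][attraction] is
    that count divided by the day's total, so days with different crowd
    sizes can be compared.
    """
    absolute = defaultdict(Counter)
    for day, attraction in checkins:
        absolute[day][attraction] += 1

    relative = {}
    for day, counts in absolute.items():
        total = sum(counts.values())
        relative[day] = {a: n / total for a, n in counts.items()}
    return absolute, relative
```

Normalizing within each day is what allows an attraction to rise in absolute check-ins while falling in relative preference.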
Space use was, in general terms, different between Saturday and Friday and
between Sunday and Friday, while it was similar between Sunday and Saturday
(Figure 3). On Saturday and Sunday, the relative frequency of people was
higher in most of the north zone of the park than on Friday (Figure 3 upper
panels, highlighted in red), while on Friday the relative frequency of
people was higher in most of the south zone of the park (Figure 3 upper
panels, highlighted in blue). Sunday differed from Saturday mainly in the
lower relative frequency of people at attraction 32 (Figure 3 lower panel,
highlighted in blue).
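The space-use comparison takes, for each square of the park, the difference between two days in the relative share of detected signals. A minimal sketch follows, assuming hypothetical (x, y) signal coordinates and an arbitrary square size; neither is taken from the submission.

```python
from collections import Counter

def relative_use(points, cell=10):
    """Share of a day's signals that fall in each grid square.

    points: iterable of (x, y) integer coordinates (hypothetical layout).
    cell: side length of a square, an illustrative assumption.
    """
    counts = Counter((x // cell, y // cell) for x, y in points)
    total = sum(counts.values())
    return {square: n / total for square, n in counts.items()}

def use_difference(day_a, day_b, cell=10):
    """Per-square difference in relative use (day_a minus day_b).

    Positive values mean the square was used relatively more on day_a.
    """
    rel_a = relative_use(day_a, cell)
    rel_b = relative_use(day_b, cell)
    squares = set(rel_a) | set(rel_b)
    return {sq: rel_a.get(sq, 0.0) - rel_b.get(sq, 0.0) for sq in squares}
```

Under this sign convention, positive and negative squares play the roles of the red and blue highlights, depending on which day is passed first.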
Figure 2. Absolute (A)
and relative (B) attendance of different IDs at the park attractions
during Friday, Saturday and Sunday of the weekend under study.
Figure 3. Comparison of
space use through the relative frequency difference, between pairs of days, of the
relative amount of detected signals (movements, check-ins)
in a given day and square.
Figure 4. Relative frequencies
of check-ins per hour by game and day.
We also noticed that there is a schedule difference between days (Figure
4). On Friday, activities started at 8 AM and ended at 9 PM, while on
Saturday and Sunday they ended at 12 AM. Another point worth highlighting
is that there are two contractions, at 10 AM and 3 PM. This is because no
check-in is registered in Creighton Pavilion between 10 AM and 11 AM or
between 3 PM and 4 PM on any day. There is also no activity at Grinosaurus
Stage between 10 AM and 1 PM or from 4 PM on. Additionally, activity in the
park had its highest peak around 4 PM on Friday and Saturday, while on
Sunday this occurred at 11 AM. After those hours, people seem to have used
the attractions less, and check-ins decreased until the park closed.
Regarding uncommon behaviors, there is a considerable number of check-ins
at the Raptor restroom between 8 AM and 9 AM, which possibly means there is
another entrance to the park at this place.
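The hourly observations above (peak hour, hours with no check-ins at an attraction) can be derived from a simple hourly profile. The sketch below assumes a list of `datetime` check-in times for one attraction and day; the function name and the default opening window are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime

def hourly_profile(timestamps, open_hour=8, close_hour=24):
    """Relative check-in frequency per hour, empty hours, and peak hour.

    timestamps: non-empty iterable of datetime objects (hypothetical input).
    open_hour / close_hour: window to report on, an illustrative assumption.
    """
    counts = Counter(ts.hour for ts in timestamps)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no check-ins supplied")

    relative = {h: counts[h] / total for h in range(open_hour, close_hour)}
    empty_hours = [h for h, share in relative.items() if share == 0]
    peak_hour = max(relative, key=relative.get)
    return relative, empty_hours, peak_hour
```

Running this per attraction and per day would surface gaps such as the ones reported for Creighton Pavilion, as well as the shift of the peak hour on Sunday.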
MC1.3 – What anomalies or unusual patterns do you see? Describe no more than 10
anomalies, and prioritize those unusual patterns that you think are most likely
to be relevant to the crime.
Regarding space use, the southernmost part of Creighton Pavilion
(attraction 32) showed lower relative use than on a normal day, while the
northwest part of the park showed higher relative use. The main anomaly was
the lower relative use of attraction 32, although lower use, to a lesser
degree, was also detected at attractions 4, 5, 7, 63 and 81 (Figure 4A).
The last people entered Creighton Pavilion at 11 AM (Figure 4B). We suspect
that the crime occurred at that time. Additionally, we assume that when a
robbery happens, the thief tries to escape with the stolen item while
avoiding any control (check-in). In this sense, 476 IDs were present at the
Pavilion at 11 AM. Of them, six IDs (20098, 221553, 542027, 746574,
1264352, 1770473) did not check in at any other attraction of the park.
These IDs entered the park together at 08:42:41 and visited the same games
at the same time, so we suspect there is a high probability that the crime
was committed by this band of thieves. However, they did not leave the park
immediately, taking about 1 h 15 min to exit. First, they went from
Creighton Pavilion to the exit, then returned to attraction 48, and then
went back to the exit (Figure 5). Another detected anomaly was the
existence of some IDs whose last record was a check-in, with no other
activity detected afterwards. Examples are IDs 1932220, 98371 and 1042280.
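The suspect filter described above can be sketched as follows. This is one possible reading only: it assumes "no other check-in" means no check-in at a different attraction after the 11 AM Pavilion visit, and it assumes hypothetical (visitor ID, attraction, datetime) records. It is not the query actually used for this submission.

```python
from datetime import datetime

def pavilion_only_after(checkins, pavilion=32, hour=11):
    """IDs that checked in at the pavilion during `hour` and nowhere else later.

    checkins: list of (visitor_id, attraction, datetime) tuples
    (hypothetical layout; must be a list because it is scanned twice).
    """
    # Latest pavilion check-in within the hour of interest, per visitor.
    at_pavilion = {}
    for visitor_id, attraction, ts in checkins:
        if attraction == pavilion and ts.hour == hour:
            previous = at_pavilion.get(visitor_id, ts)
            at_pavilion[visitor_id] = max(previous, ts)

    # Visitors who later checked in at any other attraction are excluded.
    later_elsewhere = {
        visitor_id
        for visitor_id, attraction, ts in checkins
        if visitor_id in at_pavilion
        and attraction != pavilion
        and ts > at_pavilion[visitor_id]
    }
    return sorted(set(at_pavilion) - later_elsewhere)
```

A stricter reading (no check-in at any other attraction at any time that day) would only require dropping the `ts > ...` condition.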
Another detected anomaly concerned movement recording. On Friday, the park
sensors stopped working at 20:12:07 (Figure 6), when there were still some
people in the park (Figure 7). Additionally, valid checkouts were not
detected for 90.75% of the people who attended the park (3228 out of 3557).
On Saturday, no movement was recorded between 23:23:04 and 23:30:19 (Figure
6). On the other hand, only one ID did not register a checkout (ID
1975667). Its last registered movement was at 22:26:47 (Figure 7). Finally,
on Sunday, the last registered movement occurred at 23:25:13, and the
following IDs did not register a valid checkout: 898576, 1095309, 1376114,
2063022, 1336607, 1483705, 227221, 392618, 1722376 (Figures 6 and 7).
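Recording gaps such as the one reported for Saturday night can be found by scanning the sorted signal timestamps for unusually long pauses. A minimal sketch, assuming a list of `datetime` values for all signals in a day; the five-minute threshold is an illustrative assumption, not a value from the submission.

```python
from datetime import datetime, timedelta

def recording_gaps(timestamps, min_gap=timedelta(minutes=5)):
    """Return (start, end) pairs where no signal was recorded for min_gap or more.

    timestamps: iterable of datetime objects for every signal of one day
    (hypothetical input). min_gap is an illustrative threshold.
    """
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier >= min_gap
    ]
```

The last element of the sorted list gives the time of the final recorded movement, which can then be compared with the set of IDs still lacking a checkout.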
Figure 4. A:
Comparison of space use between Sunday and a common/normal day, through the
relative frequency difference of the relative amount of detected signals
(movements, check-ins) in a given day and square. B:
Relative frequency of check-ins to game 32 per day.
Figure 5. Trajectory of
the suspected band of thieves after the robbery.
Figure 6. Movement
detection for time intervals by day.
Figure 7. People
remaining in the park after the last movement detected. Dot color indicates the time of
the last movement registered.
The last anomaly that we detected concerned irregular movements. On
Saturday, ID 1983765 registered movements in different sectors of the park
at the same moment in time (Figure 8). This behavior began at 20:18:21, and
a checkout was also registered for this ID through the Raptor Restroom
entrance twice, at 20:34:36 and 20:36:07.
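An ID that appears in two places at once can be detected by grouping records on (ID, timestamp) and keeping the keys with more than one distinct position. A minimal sketch, assuming hypothetical (visitor ID, timestamp, x, y) records:

```python
from collections import defaultdict

def simultaneous_positions(records):
    """Find IDs recorded at more than one position at the same timestamp.

    records: iterable of (visitor_id, timestamp, x, y) tuples
    (hypothetical layout; timestamp may be any hashable value).
    Returns {(visitor_id, timestamp): sorted list of distinct (x, y)}.
    """
    positions = defaultdict(set)
    for visitor_id, timestamp, x, y in records:
        positions[(visitor_id, timestamp)].add((x, y))

    return {
        key: sorted(places)
        for key, places in positions.items()
        if len(places) > 1
    }
```

Exact duplicate records collapse into one position, so only genuinely conflicting locations are reported.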
Figure 8. Positions
of the same ID at the same times.